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Abstract 

Explicit functional forms for the generator derivatives of well-known one-parameter 
Archimedean copulas are derived. These derivatives are essential for likelihood 
inference as they appear in the copula density, conditional distribution functions, 
or the Kendall distribution function. They are also required for several asymmetric 
extensions of Archimedean copulas such as Khoudraji-transformed Archimedean 
copulas. Access to the generator derivatives makes maximum-likelihood estimation 
for Archimedean copulas feasible in terms of both precision and run time, even in 
large dimensions. It is shown by simulation that the root mean squared error is 
decreasing in the dimension. This decrease is of the same order as the decrease 
in sample size. Furthermore, confidence intervals for the parameter vector are 
derived. Moreover, extensions to multi-parameter Archimedean families are given. 
All presented methods are implemented in the open-source R package nacopula and 
can thus easily be accessed and studied. 
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1 Introduction 

The well-known class of Archimedean copulas consists of copulas of the form 

C{u) = V^CV'Cni) + • • • + VM), u G [0, 1]^ 

with generator ip. In practical applications, ■0 belongs to a parametric family (V'e)0e0 
whose parameter vector 6 needs to be estimated. 

There are several known approaches for estimating parametric Archimedean copula 



families; see Hofert et al. (2011) for an overview and a comparison of some estimators. 
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In the work at hand, we consider a (semi-)parametric estimation approach based on 
the hkehhood. There are two significant obstacles to overcome. The first one is to derive 
tractable algebraic expressions for the generator derivatives and thus the copula density. 
The second is to evaluate these expressions efficiently in terms of both precision and run 
time. 

Although the density of an Archimedean copula has an explicit form in theory, accessing 
the required derivatives is known to be challenging, especially in large dimensions. 



For example. Berg and Aas (2009) mention that for Archimedean copulas it is not 



straightforward to derive the density in general for all parametric families. For the 
Gumbel family, they say that one has to resort to a computer algebra system, such as 
Mathematica or the function D in R, to derive the d- dimensional density. Note that 
computations based on computer algebra systems often fail already in low dimensions. 
Even if a theoretical formula can be computed, the numerical evaluation of such (typically 
lengthy) formulas is prone to errors since they are not given in a numerically tractable 
form. This often requires to work with a large number of significant digits which is 
typically far too slow to be applied in large-scale simulation studies (for example, to 
access the quality of goodness-of-fit testing procedures). Furthermore, as we will point 
out below, results obtained by computer algebra systems can be unreliable. 



Generator derivatives for some important Archimedean families can be found in Shi 



(1995), Barbe et al. (1996), and Wu et al. (2007), however, in recursive form. In this 



work, we derive explicit formulas for the generator derivatives of well-known Archimedean 
families in any dimension. These derivatives are interesting in their own right, for 
example, for accessing densities, for building conditional distribution functions, or for 
evaluating the Kendall distribution function. They can also be used to explicitly compute 
densities of asymmetric extensions of Archimedean copulas such as Khoudraji-transformed 
Archimedean copulas. 

We then tackle the problem of maximum-likelihood estimation for Archimedean copulas 
for these families. Focus is put on large, say ten to one hundred, dimensions since they are 



the most relevant in practice; see Embrechts and Hofert (2011). Note that the considered 



Gumbel family is also an extreme value copula, for which densities in general are rarely 



known. Hofert et al. (2011) show the excellent performance of the maximum- likelihood 



estimator as measured by both precision and run time in a large-scale comparison with 
various other estimators up to dimension one hundred. Furthermore, to add transparency, 
all the algorithms used in this paper are implemented in the open source R package 
nacopula, so that the interested reader can study the non-trivial details of the numerical 
implementation and the numerous tests conducted in more detail. In the work at hand, 
we also consider examples of multi-parameter Archimedean families. In contrast to 
method-of-moments-like estimation procedures such as the one based on Kendall's tau, 
maximum-likelihood estimation is not limited to the one-parameter case. Furthermore, 
we address the problem of computing initial intervals for the optimization of the log- 
likelihood for the multi-parameter Archimedean families considered. Additionally, we 
show how confidence intervals for the copula parameter vector can be constructed. 

The paper is organized as follows. In Section [2j we briefly recall the notion of 
Archimedean copulas and the families considered. Section [3] presents explicit functional 
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forms of the generator derivatives of these families and the corresponding copula densities 
are derived. In Section [4j the root mean squared error is investigated as a function of the 
dimension. Section [5] presents methods for constructing confidence intervals for the copula 
parameter vector. In Section |6] we address extensions to multi-parameter Archimedean 
families, including a strategy for computing initial intervals and two examples of two- 
parameter families. Finally, Section [7] concludes. 

2 Archimedean copulas 

Definition 2.1 

An (Archimedean) generator is a continuous, decreasing function ip : [0,oo] — >■ [0,1] 
which satisfies ip{0) = 1, ip{oo) = limt_j.oo ^(i) = 0, and which is strictly decreasing on 
[0,inf{t : ijj[t) = 0}]. A d- dimensional copula C is called Archimedean if it permits the 
representation 



C{u) = ^{^-\ui) + ■■■ + i^-^{ud)), u e [0, 1]^ 



(1) 



for some generator ip with inverse tp ^ : [0, 1] — [0, oo], where ^p ^(0) = inf{t : xp{t) = 0}. 



McNeil and Neslehova (2009) show that a generator defines an Archimedean copula if 



and only if ^l> is d-monotone, meaning that ip is continuous on [0, oo], admits derivatives 
up to the order d-2 satisfying {-l)^ ^'ip{t) > for aU A; G {0, . . . , d - 2}, t G (0, oo). 



\d-2 d'^- 



Tipit) is decreasing and convex on (0, oo). 



and (—1 

According to McNeil and Neslehova (20091), an Archimedean copula C admits a density 
c if and only if exists and is absolutely continuous on (0, oo). In this case, c is 

given by 




u e (0,1) 



(2) 



where t{u) = J2'j=iip{uj). 

We mainly assume tp to be completely monotone, meaning that ip is continuous on 
[0,oo] and {-l)^^^p{t) > for all A: e No, t G (0, oo), so that ip is the Laplace- Stieltjes 
transform of a distribution function F on the positive real line, that is, ^ = £5[-F]; see 



Bernstein's Theorem in Feller (1971, p. 439). The class of all such generators is denoted 
by ^oo and it is clear that a, ip £ generates an Archimedean copula in any dimensions 
d and that its density exists. 



There are several well-known parametric generator families; see Nelsen (2007 pp. 116), 
also referred to as Archimedean families. Among the most widely used in applications 
are those of Ali-Mikhail-Haq ("A"), Clayton ("C"), Frank ("F"), Gumbel ("G"), and 
Joe ("J"); see Table [!} We consider these families as working examples throughout this 
work. Detailed information about the corresponding distribution functions F is given in 



Hofert (2011b) and references therein. Note that these one-parameter families can be 



extended to allow for more parameters, for example, via outer power transformations. 
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Table 1 Well-known one-parameter Archimedean generators ^ijj with corresponding distri- 
butions F = CS-^[%Ij]. 



Furthermore, there are Archimedean families which are naturally given by more than a 
single parameter. Examples for both cases are given in Section [6| 

Table |2] summarizes properties concerning Kendall's tau and the tail-dependence 
coefficients; see |Joe] ( |1997i p. 91), |Joe and Hu] ( |1996D , and Nelsen, ( .2007^ p. 214) for 
the investigated Archimedean families. Here, Di{9) = JH t/{exjp{t) — 1) dt/9 denotes the 
Debye function of order one. Note that these properties are often of interest in order to 
choose a suitable model which is then estimated. The construction of initial intervals in 



Section 6.1 for the optimization of the likelihood is based on Kendall's tau. 
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Table 2 Kendall's tau and tail-dependence coefficients. 



3 Maximum-likelihood estimation for Archimedean copulas 
3.1 The pseudo maximum-likelihood estimator 

Assume that we have given realizations Xi, i G {1, . . . , n}, of independent and identically 
distributed ("i.i.d.") random vectors Xi, i G {1, . . . , n}, from a joint distribution function 
H with Archimedean copula C generated by ip and corresponding density c. The generator 
tp is assumed to belong to a parametric family {ipo)e(^B with parameter vector G 6 C W , 
p G N, and the true but unknown vector is 6q (similarly, C = Cqq and c = c^g). As usual, 
random vectors or random variables are denoted by upper-case letters, their realizations 
by lower-case letters. 

Before estimating 6q, the first step is usually to estimate the marginal distribution 
functions. In a second step, one then estimates Oq. This two-step approach is typically 
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0241 much easier to accomplish than estimating the parameters of the marginal distribution 

P2^3 functions and the copula parameter vector simultaneously. Estimating the marginal 

0244 distribution functions can be done either parametrically or non-parametrically. Based 

0246 maximum-likelihood estimation, the former approach is suggested by IJoe and Xu 

0246 

0247 



0260 
0261 



0266 
0267 
0268 
0269 
0260 
0261 
0262 



0264 
0268 
0266 



( 1996 ) and is known as inference functions for margins. The latter approach is known 



0248 as pseudo maximum-likelihood estimation and is suggested by Genest et al. (19951; see 

0249 



Kim et al. (2007) for a comparison of maximum-likelihood estimation, the method of 
inference functions for margins, and pseudo maximum-likelihood estimation. 
0262 Following pseudo maximum-likelihood estimation, the marginal distribution func- 

tions are estimated by their empirical distribution functions Fnj{x) = ^ J2k=i '^{xkj<x}: 
0266 j G {l,...,n}, leading to the so-called pseudo-observations Ui = [un, . . . ,Uid)'^, i G 

{1, . . . , n}, where 



-—Fnj{xi,) = ^, ie{l,...,n}, jG{l,...,4.|^ (3) 




Here, for each j G {1, . . . , d}, rjj denotes the rank of Xij among all x^j, k E {1, . . . , n}. 
0263 The asymptotically negligible scaling factor of n/(n -|- 1) is used to force the variates 

to fall inside the open unit hypercube to avoid problems with density evaluation at the 
boundaries of [0, 1]'^. As usual, the pseudo-observations are interpreted as realizations of 
0267 a random sample from C (despite known issues of this interpretation such as the fact 

that the pseudo-observations are neither realizations of perfectly independent random 

0269 ^ J 

0270 vectors nor that the components are perfectly following a univariate standard uniform 
o^T'i distribution) based on which the copula parameter vector 6q is estimated. 



0272 
0273 
0274 
0275 

0276 Maximum-likelihood estimation is based on the following theory. Given realizations tij, 

0277 
0278 

0279 is taken as Uj, i G {1, . . . ,n}, in ([3])), the likelihood and log-likelihood are defined by 

0280 
0281 



0282 L{G]Ui,. . . ,Un) = ^C0{ui) and l{6]Ui, . . . ,Un) = ^l{0\Ui), 



3.2 Likelihood theory 



« G {1, . . . , n}, of a random sample Ui, i G {1, . . . , n}, from the copula C (in practice, Ui 




0283 
0284 

0286 respectively, where 

0286 

0287 d 
0288 
0289 
0290 
0291 
0292 



1=1 



=logc^(-uO =log((-l)'^#(*«(«))) +EM-(V'e')'K))- 



Here, the subscript 6 of t{u) is used to stress the dependence of t{u) on 0. The maximum- 

0293 likelihood estimator On = On{ui, . . . , Un) can thus be found by solving the optimization 

0294 1 1 

problem 

0296 ^ 
0296 

0297 0„ = argsup/(0; Ml, . . . 

0298 
0299 
0300 



This optimization is typically done numerically. 
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Assuming the derivatives to exist, the score function is defined as 



sg{u) = Vl{e;u) -- 
and the Fisher information is 

I{O)=Ke[se{U)s0{U)'^] =Ee 

for U ^C. 



d_ 



mu),. 



l{e;u) 



^ i{e-u)-^i{e-u 



89. 



«je{i,...,p} 



p. 421), 



Doksum 



Under regularity conditions (see Cox and Hinkley (1974 p. 281), Rohatgi (1976 



pp. 



384) , |Serfiing| ([l980l pp. 144),|Newey and McFaddenj (|1994[ p. 214 6)y ]Schervish| (|199"5 



Lehmann and Caseha (1998 p. 449), van der Vaart ( 2000| pp. 51), Bickel and 



(2000 p. 386), or Davison (2003 p. 118)), the fohowing result holds. 



Theorem 3.1 

(1) (Strong) consistency of maximum-likelihood estimators: 



On = 0„(?7i, . . . , ?7,) A 6>o (n ^ oo). 

du.s. 



(2) Asymptotic normality of maximum-likelihood estimators: 

V^/(0o)'/'(4 - ^o) A N{0,lp), 
where Ip denotes the identity matrix in M^^^. ^ 
3.3 Generator derivatives and copula density 

Applying maximum-likelihood estimation requires an efficient strategy for evaluating the 
(log-)density of the parametric Archimedean copula family to be estimated. The most 
important part is to know how to access the generator derivatives. As mentioned in the 
introduction, this requires to know both a tractable algebraic form of the derivatives and 
a procedure to numerically evaluate the formulas in an efficient way in terms of precision 
and run time. 

As mentioned in the introduction, it is often stated that a computer algebra system 
can be used to access a generator's derivatives. Such an approach has typically two major 
flaws: 

(1) It is not trivial and sometimes not possible for a computer algebra system to find 
derivatives of higher order; 

(2) Even if formulas are obtained, they are usually not provided in a form which is both 
numerically stable and sufficiently fast to evaluate. 

We experienced these flaws when we tried to access the 50th derivative of a Gumbel 
generator ipQ{t) with parameter 6 = 1.25 at t = 15. On a MacBook Pro running Max OS 
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0361 X 10.6.6, we aborted Mathematica 8 after ten minutes without obtaining a result. Maple 

14 lead to the values 10 628, -29 800, and others (without warning) when computing 



0362 
0363 



036S 
0366 



0369 
0370 



0411 
0412 



0364 ■01^25(1^) several times. Note the chaotic behavior of this deterministic problem; the 



nerator derivi 

o 



{-l)''4''\t) = ^Li_d(0exp(-t)), t e deNo, 



values should of course be equal and positive! MATLAB 7.11.0 did return the correct 
0367 value of (roughly) 1057, but failed to access "01^25^(15) (aborted after ten minutes). Let 

us stress that carelessly using such programs in simulations may lead to wrong results. 
Apart from numerical issues, the formulas for the derivatives obtained from computer 
0371 algebra systems can become quite large and thus rather slow to evaluate. They are 

therefore not suitable in large-scale simulation studies, for example, for goodness-of-fit 
0374 tests (or simulations of their performance) involving a parametric bootstrap. 

0376 In the following theorem we derive explicit formulas for the generator derivatives for 
all Archimedean families given in Table [Tl 

0377 " I— I 

Theorem 3.2 

0379 

0380 (1) For the family of Ali-Mikhail-Haq, 

0381 
0382 
0383 
0384 
038S 

0386 where Lis(z) denotes the polylogarithm of order s at z. 

0387 
0388 
0389 
0390 
0391 
0392 

0393 where (d - 1 1/6*)^ = nfe=i(^ + V^') = twT^ denotes the falling factorial. 

0394 y I I 

0395 (3) For the family of Frank, 

0396 

0397 , ,^ \ 

(-l)'^Vr(O = Li-(d-i)((l - e-') exp(-t)), t E (0, oo), d G Nq. 

0399 ^ 
0400 

0401 (4) For the family of Gumbel, 

0402 
0403 
0404 
040S 
0406 

0407 where 

0408 
0409 



(2) For the family of Clayton, 

(-l)'^#(t) = (d-l + l/^),(l + t)-(^+i/^), tE(0,oo), deNo, 




(-i)'^4'^w = ^^'.^.(n, te(o,oo), d 



G N, 



- ^ 7 



k=\ 



:::: = i-ir-' e o-^s{d,mj, a:) = ^ e (^) (7) (-i)^-^ ^ g {i, . . . , ^i, 

0415 j=k ' j = l V-^/ \ / 

0416 

0417 and s and S denote the Stirling numbers of the first kind and the second kind, 

0418 ,. 1 

0419 respectively. 

0420 
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0440 
0441 
0442 



0444 



044r 
0448 
0449 



(5) For the family of Joe, 

0422 

042! (_ud^iM)(f)- ^^^^Pt!)^_pJ ( ^M-t) 

0426 

0*27 where 

0428 
0429 

0432 



, t G (0,00), d€N, 



k=l 



0433 J T{k — a) 



0434 
0435 
0436 

0437 Proof 



aJ,(0) = A:)(A: - 1 - 1/9)^-1 = S{d, k)-^ (, ke{l,...,d}. 

L [1 — a) 



0438 

0439 (1) The generator of the Archimedean family of Ali-Mikhail-Haq is of the form ipg{t) = 

X^fcLi Pfc exp(— /ct), t E [0,00), with probability mass function {pk)kLi as given in 
Table [1} This implies that {-l)'^i;^f\t) = '£'k=iPkk'^ exp{-kt) from which the state- 
04*3 ment easily follows from the definition of the polylogarithm as l^is{z) = J2T=i /^'^ ■ 



0446 (2) The result for Clayton is straightforward to obtain by taking the derivatives. 

0446 



(3) Similar to (1) ^ ^ 



(4) Now consider Gumbel's family. Writing the generator in terms of the exponential 

0460 series and differentiating the summands, leads to ipjf'\t) = J2T=i{~^)^ / kK'^k)dt°'^~'^ , 

0461 J (d) 

0452 where a = 1/6. Since for d G N, {ak)d = J2j=i ^{d, j){ctky , one obtains ipg (t) = 

0453 t-'^Er=i(-*°)'A!E?=is(d,i)(aA;y = t-^j:Ui^'^id,j)j:k=ikK-n''/k\. Note 

0454 / 

0455 that ex p(— x) J2T=() k^x^/k l is the jth exponential polynomial and equals Yl,k=o '^(■?' k) 



0456 .^.fc. ggg 

0457 



Boyadzhiev (2009). With x = — and noting that the summand for = is 



0458 zero, we obtain '(/J^(t) = V'eCi)^ Z]j=i '^"'^('^1 j) Z]fe=i 'S'O, i")'^- Interchanging 

the order of summation leads to i)f\t) = ^Pe{t)t-'^ ELi(-*")^ E?=fc ah{d,j)S{j, k) 

0461 = Yi^^ iafc-d(_i)fe Yfj^i, ah{d, j)S{j, k) from which the result about {-ifil^f^ 

0462 / \ 

0463 directly follows. For the last equality in the statement about a^kK^) iiote that 

IZ = {-ir^'/diEto (t)(-i)'-'E?=oMyKd,j) = (-i)'^Eto (Dil) 

0467 ■( — !)' from which the result follows. 

(5) For Joe's family, (-l)'^V'r(t) = - exp(-t))", d G N, where a = 1/6. 

Letting x = exp(— t), this equals —{x-^y{l — x)"'. The operator x-^ is investi- 



gated in Boyadzhiev 



(2009). It follows from the results there that (-l)''^'^ (*) 



0468 
0469 
0470 
0471 
0472 
0473 
0474 
0475 
0476 
0477 

0478 □ 

0479 

0480 



-Et=iS{d,k){-xr{a)k{l - xr-'^ = -{I - xrY.l=iS{d,k){aU-x/{l - x)f. 
Thus, {-lY^f\t) = a(l - xYY.k=iS{d,k){k - 1 - a)k-i{x/{l - x)f. Resubsti- 
tuting leads to the result as stated. 
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0481 With the notation as in Theorem 3.2 we obtain the following representations for the 

0482 



0492 
0493 
0494 
0495 



0498 
0499 
0500 
0501 
0502 
0503 
0504 
0506 



0520 
0521 
0522 
0523 
0524 
0526 
0526 
0527 
0528 



0534 
0536 
0536 



0539 
0540 



densities of the Archimedean families of Ali-Mikhail-Haq, Clayton, Frank, Gumbel, and 
Joe. 



0483 
0484 
0486 

0486 Corollary 3.3 

0487 
0488 
0489 
0490 



(1) For the family of Ali-Mikhail-Haq, 



0491 Cg{u) = -2 ^ ^Li_d(/le (-U)), 



Where hj{u)=euUT^m^)- 



0496 (2) For the family of Clayton, 

0497 



d-1 . d 

ce{u) = l[{ek + l)(l[uA {l + te{u)) 

k=0 j=i 



y{d+l/e)_ f 



0507 
0508 
0509 
0510 
0511 

0512 (4) For the family of Gumbel, 

0513 
0514 
0516 



(3) For the family of Frank, 



where h^{u) = (1 - e"^)!-"' n?=i(l - expi-Ouj))^^ 



0516 U{uYVG=\Uj 
0517 
0518 

0519 (5) For the family of Joe, 




where /i^H = n,li(l - (1 - %)^). 

Proof ^ ^kk. 

0529 The proof is tedious but straightforward to obtain from Formula (|2]) and the results from 

0530 Theorem [3jl ^^^^^ □ 

0531 V 



0532 The following remarks stress the importance of Theorem 3.2 and Corollary |3.3[ 

0533 



Remark 3.4 -^f^ 

(1) Recursive formulas for the generator derivatives for some Archimedean families were 



0537 presented by Barbe et al. (|1996|) and Wu et al. (|2007|). In contrast. Theorem 3.2 

0538 



provides explicit formulas. As seen from Corollary 3.3 this allows us to explicitly 



compute the densities of the corresponding well-known and widely used Archimedean 
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0541 
0542 
0543 
0544 
054S 
0546 
0547 
0548 
0549 
0560 
0561 
0562 
0563 
0564 
0565 
0566 
0567 
0568 
0569 
0560 
0561 
0562 
0563 
0564 
0566 
0566 
0567 
0568 
0569 
0570 
0571 
0572 
0573 
0574 
0576 
0576 
0577 
0578 
0579 
0580 
0581 
0582 
0583 
0584 
0588 
0586 
0587 
0588 
0589 
0590 
0591 
0592 
0593 
0594 
0596 
0596 
0597 
0598 
0599 
0600 



families, even in large dimensions. Furthermore, it allows us to compute conditional 
distribution functions based on these families and important statistical quantities such 
as the Kendall distribution function, which is of interest, for example, in goodness-of- 



Genest et al. 


(2006 


), 


Genest et al. 


(2009 


1, or 



Among others, note that extreme value copulas rarely have an explicit form of the 
density, the important Gumbel family can now be added to this list. 



(2) The derivatives presented in Theorem 3.2 also play an important role in asymmetric 
extensions of Archimedean copulas. For example, consider a Khoudraji-transformed 
Archimedean copula C, given by 



C{u) = C4u^\...,u''/)U{ul- 



where denotes an Archimedean copula generated by ■0, H denotes the indepen- 
dence copula, and aj £ [0, 1], j G {1, . . . are parameters. Given the generator 
derivatives, the density of a Khoudraji-transformed Archimedean copula is given by 



c{u) 



JC{l,...,d} 



n ^j(^v 



')n(i-« 



This makes maximum likelihood estimation for these copulas feasible; see Hofert and 



Vrins (20111 for an application. 



(3) As pointed out by Hofert (2010b pp. 117), new Archimedean copulas are often 



constructed with simple transformations of the generators addressed in Theorem 3.2 



The results in Theorem 3.2 might therefore carry over to other Archimedean families. 
In fact, one example for such a transformation is the outer power transformation 
addressed in Section |6l 

(4) For an Archimedea n generator with unknown derivatives but known F = CS^^lil)], 
(2011 ) suggested to approximate (— l)'^-!/'^'^'' via 



Hofert et al. 



1 m 

{-iy^^''\t) Vk'expi-Vkt), t G (0, 

m 



oo 



k=l 



where Vk ^ F, k £ {1, . . . ,m}, are realizations of i.i.d. random variables following 
F = CS~^[ip]. In the conducted simulation study, this approximation turned out 
to be quite accurate. Furthermore, it is typically straightforward to implement. 
However, such a Monte Carlo approach is of course slower than having a direct 
formula for the generator derivatives at hand. 

4 Sample size n vs dimension d 



The results of [Hofert et al. (20111 indicate that the root mean squared error ("RMSE") 
is decreasing in the dimension for all other parameters (Archimedean family, dependence 
level measured by Kendall's tau, and sample size) fixed. This may be intuitive for ex- 
changeable copulas since the curse of dimensionality is circumvented by symmetry. In this 
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0606 

o6or 



section we briefly investigate liow tlie RMSE decreases in tlie dimension. Figure [T] shows a 

Pg(j3 clear picture. For fixed Archimedean family (Ali-Mikhail-Haq ("AMH"), Clayton, Frank, 

0604 Gumbel, and Joe), dependence level measured by Kendall's tau (r G {0.25,0.5,0.75}), 

0608 and sample size (n G {20, 50, 100, 200}), the RMSE (estimated based on N = 500 rephca- 
tions) is decreasing in the dimension {d £ {5, 10, 20, 50, 100}). As the log-log plot further 

0608 reveals, the decrease of the RMSE in the dimension d is of the same order as in the 

0609 
0610 
0611 

0612 MSE oc ,. 

0613 nd 

0614 

0615 Although this behavior in the sample size n is well-known, the behavior in the dimension 



sample size n, that is, the mean squared error ("MSE") satisfies 

1 



0616 d is rather impressive since it contradicts the findings of Weif5 (2010), for example. In the 
latter work, conclusions are drawn based on simulations only involving small dimensions. 

0618 ' ° 

0619 In small dimensions, however, numerical problems are often not (regarded) as severe 

0620 as in larger dimensions. Sometimes, they are simply not solved correctly. However, 
Pg22 according to our experience, we believe that the larger the dimension of interest is, the 

0623 more involved numerical issues typically are. This will certainly become more important 

0624 in the future as applications are often high-dimensional. 



5 Constructing confidence intervals 



srvals^^^ 



062S 
0626 
0627 
0628 
0629 

0630 In this section, we describe different ways of how to obtain confidence intervals for the 

copula parameter vector Oq 



5.1 Fisher information 

It follows from Theorem [3rT][(2)] that 



0631 
0632 
0633 
0634 
0635 
0636 
0637 
0638 
0639 
0640 
0641 

0642 This result remains valid if I{9o) is replace by a consistent estimator I{Oo). Therefore, 

0643 
0644 
0646 
0646 
0647 
0648 

0649 where ^^^a (1 — a) denotes the (1 — a)-quantile of the chi-square distribution with p degrees 

0650 of freedom. In the one-parameter case, an asymptotic 1 — a confidence interval for 9q is 

0651 
0652 
0653 



an asymptotic 1 — a confidence region for Oq is given by 

{6» G e : (0n - e)'^nl{0o){en -e)< q^2{l - a)}. 



given by 



0655 
0656 
0657 

0658 TTrlncT-c, — fh^l 

0659 
0660 



0654 1^ ^l-a/2 A 



nl{9) \lnl{e) 

where 2:i_a/2 = <I>^^(1 — Oi/2) denotes the (1 — a/2)-quantile of the standard normal 
distribution function. 
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0661 
0662 
0663 
0664 
0666 
0666 
0667 
0668 
0669 
0670 
0671 
0672 
0673 
0674 
0676 
0676 
0677 
0678 
0679 
0680 
0681 
0682 
0683 
0684 
0686 
0686 
0687 
0688 
0689 
0690 
0691 
0692 
0693 
0694 
0696 
0696 
0697 
0698 
0699 
0700 
0701 
0702 
0703 
0704 
0706 
0706 
0707 
0708 
0709 
0710 
0711 
0712 
0713 
0714 
0716 
0716 
0717 
0718 
0719 
0720 








= 0.5 





































































x = 


-0.75 
























■ 



















































5 -1 










































































































tar". 














1 1 






log(n X d) 



5 6 7 8 9 10 
o n = 20 i n = 50 x n = 100 • n = 200 



Figure 1 log-RMSE (A^ = 500 replications) as a function of the loga- 
rithm of n-d. The plot indicates that the mean squared error 
satisfies MSE oc l/{nd) for all families and dependencies. 
Note that the family of AMH is limited to r € [0, 1/3). 
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0721 
0722 
0723 
0724 
072S 
0726 
0727 
0728 
0729 
0730 
0731 
0732 
0733 
0734 
0735 
0736 
0737 
0738 
0739 
0740 
0741 
0742 
0743 
0744 
074S 
0746 
0747 
0748 
0749 
0760 
0761 
0762 
0763 
0764 
0765 
0766 
0767 
0768 
0769 
0760 
0761 
0762 
0763 
0764 
0765 
0766 
0767 
0768 
0769 
0770 
0771 
0772 
0773 
0774 
0775 
0776 
0777 
0778 
0779 
0780 



For the estimator I{Oq), there are several options, described in what follows. Assuming 
the derivatives to exist, the observed information is defined as 



i=i i=i 



Under regularity conditions (see the references in Section 3.2), the Fisher information 
satisfies 



I{e) = E[J(0; U)] = E[-VV"^/(0; U)] = E 



that is, the Fisher information is the negative Hessian of the score function. From this 
and the definition of the Fisher information, the following choices for I{9o) naturally arise 



(see also Newey and McFadden (1994 pp. 2157) including conditions for consistency): 

(4) 
(5) 





= E, 




1 


n 




1 

n 



'n 

n 



i=l 



n 



(6) 



i=l 



The expected information I{On) is often difficult to obtain. Furthermore, Efron and 



Hinkley (1978) argue for B'^\On) in favor of I (On)- The estimator i^^\d„) is found much 
less in the literature, a reference being [Newey and McFadden] ( [l994l p. 2157). The reason 
why we state it here is that there are cases where the second-order partial derivatives are 
(much) more complicated to access than the first-order ones based on the score function, 
The following proposition provides the score functions for the one-parameter Archime- 
dean families given in Table [TJ 

Proposition 5.1 

(1) For the family of Ali-Mikhail-Haq 




where b^{u) = i^>^y 
(2) For the family of Clayton, 

'o{u) = E - E logu, + 1 log(l + te{u)) -{d + 1/9)-^'^'''^ 

k=0 j=l 



+ te{u) 
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0781 
0782 
0783 
0784 
0786 
0786 
0787 
0788 



0790 
0791 



0830 
0831 
0832 
0833 



0835 
0836 
0837 
0838 
0839 
0840 



(3) For the family of Frank, 



, , d—1 -J^ Uj /J^ Ujexn(—6ui) (d — l)e 



f^^ 1 - exp{-euj) \fr{ 1 - exp{-9uj) 1 - e 

0789 Li_(rf„l)(/1^(-U)) 



0792 (4) For the family of Gumbel, 

^^^^^ ^ d-logCe{u)log{-logCe{u)) _ ^^^^^ _ logCeju) 



0793 
0794 
0795 
0796 
0797 
0798 
0799 
0800 
0801 
0802 

With a9,(0,u) = k{bf{u) - llogte{u))a';^,{e) - {-iy-'EUkjdd,j)S{j,k)9'^. 



where b^iu) = Ejti log(- log n,)V'-i(^Xj)/te(-u) and Q^eJ^) = ELi«i(^,^)^ 

,,.,•4.1, ..\ _ i^ruG/^.\ li„„i /^.\\^G/n\ I i\d-k^d • /j ■\o/^ i,\n-i 



0804 "'^''^^ "'dk\"^ 

(5) For the family of Joe, 



0806 
0807 

/ ^ d-1 v^, , log{l -hi{u)) (1 - i)/l^(-u),,, , 



0810 j = l 

0811 
0812 
0813 
0814 
0815 
0816 
0817 

0818 a 
0819 
0820 
0821 
0822 
0823 
0824 
0825 
0826 
0827 
0828 



Qie,uihi{u)/{l-hi{u))) ^ 
+ Pl,{hiiu)/il-hiiu))) ' ^ 

V 



Proof 



where bi{u) = Y.U ~'"f-\i-l]P^' QJ^^Jx) = Y.i=ia'dk{0.u)x^-^ with 



The proof is quite tedious but straightforward to obtain from Corollary |3.3[ □ 



5.2 Likelihood-based confidence intervals 



Confidence regions or confidence intervals can also be constructed solely based on the 
likelihood function (without requiring its derivatives). For this, the likelihood ratio 
0829 statistic is used, defined as ^ 



W{e; ui, ...,«„) = 2(Z(0„; Ui, . . . , u„) - 1(0; ui, . . . , it„)). 



0834 As Davison ( |2003 p. 126) notes, the likelihood ratio statistic asymptotically follows a 



chi-square distribution, meaning that 



W{eo;Ui,...,Un) (n^oo). 
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Based on this result, an asymptotic 1 — a confidence region for 6q is given by 

0842 

ZZ {0 £ Q : l{e;ui, . . . ,Un) > l{6n]ui, . . . ,Un) - q^2{l - a)/2}. (7) 

0846 

°g*^ If only a sub- vector 6q £ Qpi C M^' of components of 6q = (So""", ^o"'")"'" interest 

0848 (0Q and Gq are referred to as parameters of interest and nuisance parameters, respectively), 

0849 asymptotic confidence region for 6q follows from a similar argument to before, based 
0861 profile log-likelihood 

0862 

0864 lp^{e';Ui,...,Un) =SUJ)l((^\,Ui,...,Unj = I ((^^gi) ; Ui, . . . , u. 



0866 
0866 



0883 
0884 
0886 
0886 



0892 
0893 
0894 



0897 
0898 
0899 
0900 



0" 



0867 where 0"'^' is the maximum- likelihood estimator of Oq given 0^. Under regularity 

conditions, the generalized likelihood ratio statistic 



satisfies 



0868 
0869 
0860 
0861 
0862 
0863 
0864 
0866 
0866 
0867 
0868 
0869 

0870 An asymptotic 1 — a confidence region for 9q is thus given by 

0871 
0872 
0873 
0874 

0876 where 

0876 

osre = argsup Ipi {6'; Mi, . . . , w„). 

0879 9'e0' 

0880 





{e' G Qp, : lpi{0';ui, . . . ,u„) > lp,{d'^;ui, ...,«„)- g 2 (1 - a)/2}, 




0881 This will be used in Section [6] to construct confidence intervals for multi-parameter 

0882 famihes. 



Example 5.2 

The left-hand side of Figure [2] shows the log-likelihood of a Clayton copula based on a 

0887 1 00-dimensional sample of size n = 100 with parameter 9q = 2 such that the corresponding 

0888 bivariate population version of Kendall's tau equals t{9q) = 0.5. The maximum-likelihood 

0890 estimator is denoted by On and the lower and upper endpoints of the likelihood-based 

0891 0.95 confidence interval by O'l'^^ and 0^'^^, respectively. The right-hand side of Figure [2] 
shows the profile likelihood plot for the same sample. Similarly for Figure [3] which shows 
the log-likelihood and profile likelihood plot for the 100-dimensional Gumbel family with 

0896 parameter Oq = 2 such that Kendall's tau equals t{9o) = 0.5. 

0896 
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0901 
0902 
0903 
0904 
0906 
0906 
0907 
0908 
0909 
0910 
0911 
0912 
0913 
0914 
0916 
0916 
0917 
0918 
0919 
0920 
0921 
0922 
0923 
0924 
0926 
0926 
0927 
0928 
0929 
0930 
0931 
0932 
0933 
0934 
0936 
0936 
0937 
0938 
0939 
0940 
0941 
0942 
0943 
0944 
0946 
0946 
0947 
0948 
0949 
0960 
0961 
0962 
0963 
0964 
0966 
0966 
0967 
0968 
0969 
0960 



log-likelihood of a Clayton copula 



Profile likelihood plot for 9 




— I 1 1 n 1 \ — I 

1.94 9?" 1.98 eoe„ 2.02 2.04 9j" 

e 

n = 100 rf = 100 9o = 2 t(9o) = 0.5 




1.94 1.96 1.98 eoe„ 2.02 2.04 2.06 
n = 100^^100 M = 2 T(eo) = 0.5 



Figure 



log-likelihood of a Gumbel copula 



leter 6 
lelih^^ plot 



2 Plot of the log-likelihood of a Clayton copi]j^|||t) b^ed on a sample of size 
n = 100 in dimension d = 100 with parameter = 3»uch that Kendall's tau 
equals 0.5. Corresponding profile likelil 



plot (right). 



Profile likelihood plot for 9 



- 5764 




9 

„ = 100 d = 100 9o = 2 x(9o) = 0.5 



1 1 1 1 — I 1 1 1 r 

1.97 1.98 1.99 9o 9„ 2.02 2.03 2.04 2.05 
9 

„ = 100 d = 100 9o = 2 x(9o) = 0.5 



Figure 3 Plot of the log-likelihood of a Gumbel copula (left) based on a sample of size 
n = 100 in dimension d = 100 with parameter Oq = 2 such that Kendall's tau 
equals 0.5. Corresponding profile likelihood plot (right). 
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0961 
0962 
0963 
0964 
096S 



5.3 A simulation study to access the coverage probability 



In this section, we compare the different approaches for obtaining (asymptotic) con- 
fidence regions and intervals. For this, we conduct a simulation study to access the 
0966 coverage probability. The methods for obtaining confidence intervals based on the Fisher 

information are denoted by for Q, "/(^H^n)" for Q, and for g; the 



0968 
0969 



0998 
0999 
1000 



1017 



likelihood- based approach ([7| by 'W. 



0970 As can be seen from Proposition 5.1 already the score functions can be quite com- 

0971 



plicated. In order to be able to investigate the method i^'^\9n) based on the observed 
)n, we only consider the Clayton family, for which 



0972 

0973 information, we only consider the Clayton family, for which 

0974 
0975 
0976 
0977 
0978 
0979 
0980 
0981 
0982 
0983 
0984 
098S 

0986 dimensions d G {5, 20} for the dependencies r G {0.25, 0.5, 0.75}. For each of these setups 

0987 
0988 




{d+l/9) 



r 



l + te{u) 

with tg('u) = ^t0{u) = J2j^i{— log Uj)u~^ , that is, for which l^'^\0n) can be easily 
computed. Our simulation study is based on the sample sizes n £ {100,400} in the 



and each of the methods I{6n), I^^\6n), i^'^\6n), and W, we determine the proportion 
0989 of cases among N = 1000 replications for which the true parameter is contained in the 

computed confidence interval. Since the expected information is not known explicitly, we 

0991 

Qgg2 evaluate it by a Monte Carlo simulation based on samples of size 10 000. 

0993 Table |3] shows the results of the conducted simulation study. Overall, all methods 

work comparably well. Note that from a computational point of view, I^^\On) is 

0996 preferred to I{9n) if the latter has to be evaluated based on a Monte Carlo simulation. 

0997 Furthermore, I^'^\6n) is typically difficult to evaluate, due to the complicated second order 
derivatives; the tractable Clayton family is certainly an exception. Even I^^\On) may be 



(numerically) challenging for some families, as Proposition 5.1 indicates. The likelihood 
1001 based approach W has several advantages. First, it is typically even simpler to evaluate 

1°°^ than I^^\9n)- Second, it may lead to asymmetric confidence intervals. Finally, by using 

1004 a re-parameterization, it allows one to construct confidence intervals for quantities such 

loos as Kendall's tau or the tail-dependence coefficients (otherwise often obtained from the 

1007 Delta Method based on the approximate normal distribution). 

1008 
1009 
1010 
1011 
1012 
1013 



6 Multi-parameter families 



The one-parameter generators of Ali-Mikhail-Haq, Clayton, Frank, Gumbel, and Joe can 
1014 easily be extended to allow for more parameters, for example, by so-called outer power 



1018 transformations or even more general generator transformations; see Hofert (2010a) 

1016 



Hofert 


(2010b 


1, or 


Hofert ( 


2011a 



1018 copula and the Archimedean GIG family and apply maximum-likelihood estimation 

1019 fop estimating the copula parameters. Both of these families are available via the R 

1020 
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1021 
1022 
1023 
1024 
1026 
1026 
1027 
1028 
1029 
1030 
1031 
1032 
1033 
1034 
1036 
1036 
1037 
1038 
1039 
1040 
1041 
1042 
1043 
1044 
1046 
1046 
1047 
1048 
1049 
1050 
1051 
1052 
1053 
1054 
1056 
1056 
1057 
1058 
1059 
1060 
1061 
1062 
1063 
1064 
1066 
1066 
1067 
1068 
1069 
1070 
1071 
1072 
1073 
1074 
1076 
1076 
1077 
1078 
1079 
1080 



Coverage probabilities for Clayton (in %) 

1 — a n T d 



Method for obtaining confidence intervals 




0.995 



96.0 
94.9 
94.0 
95.8 
95.6 
95.9 



95.2 



^^^^ 



99.1 
98.9 
98.4 
99.3 
99.2 
98.8 

98.8 
99.3 
99.0 
99.6 
98.7 
99.1 

99.7 
99.5 
99.4 
99.9 

99.5 
99.6 

99.3 
99.6 
99.2 
99.9 
99.3 
99.8 



95.6 
94.9 
94.0 
95.8 
95.7 
95.9 

95.1 
96.0 
95.0 
95.2 
94.9 
94.7 

99.1 
98.9 
98.4 
99.3 
99.2 
98.8 

98.8 
99.3 
99.0 
99.6 
98.7 
99.1 

99.5 

99.5 
99.3 
99.9 
99.5 
99.5 

99.3 
99.6 
99.2 
99.9 
99.2 
99.7 



Table 3 Simulated 
replications. 



age probabilities for Clayton's family based on N 



1000 
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1081 
1082 
1083 
1084 
1086 
1086 
1087 
1088 
1089 
1090 
1091 
1092 
1093 
1094 
1095 
1096 
1097 
1098 
1099 
1100 
1101 
1102 
1103 
1104 
llOS 
1106 
1107 
1108 
1109 
1110 
1111 
1112 
1113 
1114 
1115 
1116 
1117 
1118 
1119 
1120 
1121 
1122 
1123 
1124 
1125 
1126 
1127 
1128 
1129 
1130 
1131 
1132 
1133 
1134 
1135 
1136 
1137 
1138 
1139 
1140 



package nacopula so that the interested reader can easily fohow our calculations. The 
computations carried out in this section were run on a Mac mini under Mac OS X Version 
10.6.6 with a 2.66 GHz Intel Core 2 Duo processor and 4 GB 1067 MHz DDR3 memory. 
The R version used is 2.12.1. 

6.1 Finding initial intervals 

Maximizing the log-likelihood / is typically achieved by a numerical routine. These 
algorithms often require an initial interval (or an initial value, which can be derived from 
the former). This interval should be sufficiently large in order to contain the optimum, 
but also sufficiently small in order to find the optimum fast. Furthermore, one should be 
able to compute an initial interval in a small amount of time in comparison to the actual 
log-likelihood evaluations required for maximizing the log-likelihood. 

For Archimedean families with ipg G ^'ooj the measure of concordance Kendall's tan is a 



function in 6 which always maps to the unit interval; see, for example, Hofert (2010b pp. 



59). It thus provides an intuitive "distance" in terms of concordance. For one-parameter 
families, one can thus typically choose an initial interval of the form 

[r"^(max{f - h,Ti}),T'^{mm{T + /i, r„})], 

where h € [0, 1] is suitably chosen with intuitive interpretation as "distance in concordance" 
and Ti and t„ denote lower and upper admissible Kendall's tan for the families considered 



(in Example 5.2 we used this technique to find an interval on which the log-likelihood 
is plotted; we took f as the correct value r = 0.5, and used h = 0.01 and h = 0.015 for 
Clayton's and Gumbel's family, respectively). If the dimension is not too large, one can 
take the mean of pairwise sample versions of Kendall's tau as estimator f of Kendall's 



tau; see Berg (2009), Kojadinovic and Yan (2010), and Savu and Trede (20101 for this 



estimator. Another option is a multivariate version of Kendall's tau; see Jaworski et al. 
(2010 pp. 217). A fast way, especially in large dimensions, is to utilize the explicit 
diagonal maximum-likelihood estimator ^ 

logd i 



logn-log(Er=i-log>^i) 



where K- 



max 

J6{l,...,d} 



Uij, i £ {1, . . . ,n}. 



for Gumbel's family, see Hofert et al. (20111, and estimate Kendall's tau by r*^(6'^), 
where t^{0) = {9 — l)/6 denotes Kendall's tau for Gumbel's family as a function in 
the parameter. Since the optimization for one-parameter families is typically not too 
time-consuming, one can also just maximize the log-likelihood on a reasonably large, fixed 
interval, for example [r~^(/ii), r~^(/i2)], where hi and /i2 are suitably chosen constants 
in the range of r; see [Hofert et al.] ( |20TT] ). 

For multi-parameter Archimedean families, the log-likelihood is typically even more 
challenging to evaluate. An initial interval therefore also serves the purpose of reducing 
the parameter space to an area where the log-likelihood can be evaluate without numerical 
problems. The idea we present here to construct initial intervals for multi-parameter 
families is again based on Kendall's tau. In a first step, we estimate Kendall's tau 
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1198 
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by f„. To this end we apply the pairwise Kendah's tau esthiiator, which, due to the 
rather complicated log-likelihood evaluations does not take too much run time for the 
ten-dimensional examples considered below; another option would be to randomly select 
sub-columns of the data and apply the pairwise Kendall's tau estimator to this sub-data 
in order to reduce run time. Based on this estimator of Kendall's tau, we then construct 
an initial rectangle by three points. These points are determined via T~^{fn — h-) and 
■'"^^(^n + ^+)) that is, via certain positive numbers h- and /i+ (sufficiently small to ensure 
that Tn — h- and f„ -|- /i+ are in the range of admissible Kendall's tau). They allow for an 
intuitive interpretation as "distance in (terms of) concordance" and are independent of 
the parameterization of the family (since they measure distances in Kendall's tau and not 
in the underlying copula parameters). Now note that is not uniquely defined for two- 
or more-parameter families. It is, however, if one fixes all but one parameter. By starting 
with one corner of the initial rectangle to be constructed and applying monotonicity 
properties of r as a function in its parameters, one can thus construct an initial rectangle 



around the estimate of t{Oq). More details are given in Sections 6.2 and 6.3 for the 
two-parameter Archimedean families investigated. 



Sections 6.2 



6.2 Outer power copulas 

If ^ G \I'oo, so is ijj{t) = 'tp{t^^^) for all f3 £ [l,oo), since the composition of a completely 
monotone function with a non-negative function that has a completely monotone derivative 



is again completely monotone; see Feller (1971, p. 441). The copula family generated by 



tp is referred to as outer power family. 

The generator derivatives of 'ip{t) = 
derivatives of compositions which dates back at least to Schlomilch (1846). According to 
this formula. 



can be accessed with a formula about 



{-lf^(d)(^t) = P°P(t^/^)/t'^, den, where P°p(x) = ^ a%{/3) {-!)'' ip'^^\x)x'' 



k=l 



Via ([2]) and the form of given in Theorem 
density of an outer power copula. 



3.2 



(4) one can thus easily derive the 



For sampling 1/ ~ F = CS ^['ijj], Hofert (201 la I derived the stochastic representation 
V = SV^ S~S(l//3,l,cos'3(^/(2/3)),l|;3=i};l), V ^ F = CS'^[i;]. 

Note that V can easily be sampled via the R package nacopula for all given in Table [l] 
We consider the case where ip is Clayton's generator, so we obtain the two-parameter 
outer power Clayton copula with generator ^{t) = (1 -|- t^^^)~^^^. This copula, which 



generalizes the Clayton family, was successfully applied in Hofert and Scherer (20111 in 



the context of pricing collateralized debt obligations. For this copula, Kendall's tau and 
the tail-dependence coefficients are given explicitly by 



T = r(e,/3) = 1 



(3{e + 2)'- 



Ar 



(8) 
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Note the possibility to have upper tail dependence for this copula, which is not possible 
for a Clayton copula. 

The following algorithm describes a procedure for finding an initial interval for outer 
power Clayton copulas. The algorithm can easily be adapted to other outer power 
copulas, given that the base family (the family generated by ip) is positively ordered in 
its parameter and admits a sufficiently large range of Kendall's tau. 

Algorithm 6.1 

(1) Choose > 0, and e > 0. 

(2) Let the smallest (3 be denoted by /?/ = 1. 

(3) Solve t{9u, A) = min{fn + 1 — e} with respect to 9u- 

(4) Solve t{9i,(3i) = max{-f„ — h-,e} with respect to 6*;. 

(5) Solve t{9i, Pu) = minify + h+,1 — e} with respect to 

(6) Return the initial interval / = [{9i, Pi)'^, {9^, Pu)'^]- 




The idea behind Algorithm 6.1 is to construct an initial rectangle by three points. First, 
the lower-right endpoint of the rectangle is constructed. Since t{9,(3) is an increasing 
function in both 9 and /?, the largest 9 and the smallest (3, that is, {9u, A)""", are chosen such 
that Kendall's tau equals f„ plus a small "distance in concordance" > to ensure that 
9u is indeed an upper bound for 9. The truncation done by e > is to obtain an admissible 
Kendall's tau range. Second, the lower-left endpoint is found. The monotonicity of r 
justifies determining the minimal value 9i for 9 such that T(9i,f3i) = max{f„ — h-,e}, 
where /i_ > is suitably chosen, similar to h^. In the third and final step, the upper- left 
endpoint of the initial rectangle is determined. The maximal value f3u for /3 is determined 
in a similar fashion to the first step. Note that all equations can be solved explicitly due 
to the explicit form of Kendall's tau as given in ([s]). 

To access the performance of the maximum-likelihood estimator, we generate = 1000 
times n = 100 realizations of i.i.d. random vectors following d-dimensional outer power 
Clayton copulas. For demonstration purposes, we consider d = 10. Furthermore, we 
consider three setups of dependencies: = {9, /?)"'" = (1/3, 8/7)"'" resulting in a Kendall's 
tau of 0.25; 6 = (1,4/3)""" with corresponding Kendall's tau equal to 0.5; and 9 = (2,2)""" 



with Kendall's tau equal to 0.75. For finding initial intervals. Algorithm 6.1 is applied 
with e = 0.005, h- = 0.4, and h+ = 0. The results are summarized in Table |4j where 
"RMSE" denotes the root mean squared error as before and "MUT" denotes the mean 
user time (in seconds). 

Figure |4] shows a wire- frame plot (left) of the negative log- likelihood of a sample of size 
n = 100 for the setup 6 = (1,4/3)""" (r = 0.5) and the corresponding level plot (right). 
Both plots have the initial interval determined by Algorithm |6.1| as domain and show 
both the true value Oq = {9o, /^o)"*" and the optimum 0„ = {9n, $nY ^ determined by the 
optimizer. Figure [5] shows profile likelihood plots for the two parameters 9 and /3. 
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1287 
1288 
1289 
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1292 
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1296 
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1299 
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1306 
1306 
1307 
1308 
1309 
1310 
1311 
1312 
1313 
1314 
1316 
1316 
1317 
1318 
1319 
1320 



/3n 



n 


r 


9 


Bias 


RMSE 


/3 


Bias 


RMSE 


# 


MUT 


100 


0.25 


0.3333 


0.0073 


0.0609 


1.1429 


-0.0017 


0.0429 


42 


0.7s 


100 


0.5 


1.0000 


0.0082 


0.1050 


1.3333 


-0.0003 


0.0613 


39 


0.6s 


100 


0.75 


2.0000 


0.0107 


0.1786 


2.0000 


-0.0012 


0.1088 


47 


0.7s 


500 


0.25 


0.3333 


0.0025 


0.0276 


1.1429 


-0.0014 


0.0189 


42 


1.2s 


500 


0.5 


1.0000 


0.0026 


0.0451 


1.3333 


-0.0013 


ll^|68 38 


1.1s 


500 


0.75 


2.0000 


0.0031 


0.0753 


2.0000 


-0.0013 


0.o9^49 


1.3s 



Table 4 Summary statistics for estimating two-parameter outer power Clayton copulas. 



-log-likelihood of an outer power Clayton copula 

-I- (00, Po) 

X (e„.P„r 





Iter power Clayton copula 




n = m 



ations: 9 



n = 100 d = lO x(9o, Po) = 0.5 # evaluations: 9 



Figure 4 A -wsui^rame plot (left) and corresponding level plot (right) of the negative 
log- likelihood j^ction for an outer power Clayton copula for a sample of size 
n = 100 ifi^O = (1,4/3)^ (r = 0.5) with the computed initial interval as 
domain. 



22 



6 Multi-parameter families 



1321 
1322 
1323 
1324 
132S 
1326 
1327 
1328 
1329 
1330 
1331 
1332 
1333 
1334 
1335 
1336 
1337 
1338 
1339 
1340 
1341 
1342 
1343 
1344 
134S 
1346 
1347 
1348 
1349 
1360 
1361 
1362 
1363 
1364 
1365 
1366 
1367 
1368 
1369 
1360 
1361 
1362 
1363 
1364 
1365 
1366 
1367 
1368 
1369 
1370 
1371 
1372 
1373 
1374 
1375 
1376 
1377 
1378 
1379 
1380 



Profile likelihood plot for 9 
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Figure 5 Profile likelihood plot for 6 (left) and /? (right). M 
6.3 The GIG family 

An Archimedean family which naturally allows for two parameters can be constructed 
as follows. We start with the density of a generalized inverse Gaussian distribution 
GIG(z^,(?:),x), given by "^Br 

g{x; v, (/), x) = or^ i nr\ ^'' exp(-(x/a; + 0a:)/2), x e (0, oo). 
2K^{^4>X) 

Here, G R with: (/> G [0, oo), x ^ (0, cxd), if G (— oo, 0); </), x S (0, oo), if z^ = 0; and 



G (0,oo), X e [0,oo), if G (0,cx)); see McNeil et al. (2005, p. 497). The function 



Ku{t) = cosh(z^x) exp(— t cosh(x)) dx denotes the modified Bessel function of the third 
kind with parameter u. It is decreasing in t and symmetric about zero in v. Furthermore, 
it is increasing in if z^ G (0, oo). Another important property is 



lim f'KJt) = l^-'Tiu) if 1/ G (0, oo). 



(9) 



Note that for a. ^^J £ ^oo, the generator V'(ci) generates the same Archimedean copula 
as ^{t) for all c G (0, oo). Letting c = 0/2 and 6 = \f^x leads to a comparably simple 
form of the generator of an Archimedean GIG copula with parameter vector = (zv, 0)^, 
given by 



^{t) = {l + tyl'^Ky{e^/l + t)/K^{e), iG[0,oo), z^gM, ^g(0,oo) 



(10) 



If we let 



h 



it) 



{eVi + ty^K^,{9VT+t) 



vi, zy2 G M, 6* G (0, oo), t G [0, oo) 
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1381 
1382 
1383 



one obtains from ( [9| that limg\^o ^vi,i/2 

g(t) = 2''^-''^T{i^i)/r{u2) for every 1^1,1^2 € (0,oo) 
and t £ [0,00). Since V can be written as ^(t) = (1 + t)~'^h,^^^^g{t), one obtains as 
1384 limiting case a r(z^, 1) distribution for 6* \ if 1/ G (0, 00), that is, a Clayton copula with 

^^^^ parameter 

i3g7 The density / of F = CS'^l'ip] is given by 

1388 {2xY-^ 
1390 jy-,-,-/- 9''K^{9) 

1391 

1392 so that V = X/2 ~ F with X ~ 010(1^, 1, 6*2). The GIG distribution can easily 

1393 sampled with the R package Runuran. For numerically computing note that 



1394 
1395 



1414 
1415 



1417 



1418 



1420 



1439 
1440 



/(^; ^' ^) = exp(-(0V(2x) + 2x)/2), X G (0, 00 



Ky{y) / Ky{x) < exp(x — y){y/xY for all v G (—1/2, 00) and < x < y < cxd (see Paris 



1396 (1984)), so that [0, (1 — log(t)/^)^ — 1] is an initial interval for searching if) ^{t) for all 

v G (-1/2,00). 

Kendall's tau and the coefficients of tail dependence of a GIG copula are given by 



1397 
1398 
1399 

1400 /-CO jit^^ 

t{v, 9) = i- t{eK+^^e{t)/K,{e)f dt, Tr= Xu = 



1401 

1402 Jo 



where K+k,e{t) = K,y+k{OVT^) /VT^''^'' , t G [0,oo), 6 G (0,oo), and u,keR. For 



1403 
1404 

1406 computing the tail dependence coefficients, consider ip' (2t) / ijj' (t) for the limits t ^ 

and i t CO, and use K^{t) ^7r/(2t) exp(— t) (valid for t 3> — 1/4|) in the latter 

1408 case. A numerically stable evaluation of the integral formula for Kendall's tau for small 

1409 1/ > based on the Clayton limit is given in the R package nacopula. Note that as 
^^^^ numerical results indicate, Kendall's tau is decreasing in both u and 9 v £ [0, 00); see 

1412 Figurejo] (left). If G (0,oo), the limit for Kendall's tau as ^ \ is 1/(1 + 2i>) which 

1413 equals Kendall's tau for Clayton's family with parameter Figure [g] (right) shows a 
scatter plot of 1000 bivariate vectors of random variates following a GIG family with 

1416 = (0.05, 0.0968)""" with corresponding Kendall's tau equal to 0.5. 

One advantage of the GIG family is that the generator derivatives take on a comparably 



1419 simple form, which can be represented as 



izi (-i)v^)(o = ^S^frl^' ' ^ (°'°°^' ^ ^ 

1423 

1424 This can be easily derived by differentiating ^ under the integral sign and interpreting 

1425 the resulting integrand as the density of a GIG(i^ + d, 2(1 + t), 9^/2) distribution which 
1*^^ integrates to one. Via (j2]), one then easily finds the form of the log-density of a GIG 
1428 family, given by 



M31 logc(w) = logh^^dfiMu)) + {d- l)log K,y{9) - ^logK+i^g{ip~'^{uj)), 

1432 ■ MM - Jl j=l 

1433 M ^ 

^^33 = -{'^ + d) log(l + tg{u)) + log h,+d,u,eM'^)) + + 1) E + V'"H%)) 

1436 

1437 d 



j=i 
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Kendall's tau for GIG copulas 
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Figure 6 Kendall's tau for the GIG family as a function in 9 for given (left). A 
sample of 1000 bivariate vectors of random variates following a GIG family 
with = (0.05, 0.0968)"'" and corresponding Kendall's tau equal to 0.5 (right). 

The following algorithm describes a procedure for finding an initial interval for GIG 
copulas with i/ £ [0, oo). The idea behind this algorithm is similarly to those of Algorithm 



6.1 However, it takes into account that r(i/, 9) is decreasing in both parameters G [0, oo) 
and 9 G (0,oo) and thus first determines the upper-left, then the lower-left, and finally 
the lower-right endpoint of the initial rectangle. As before, denotes an estimator of 
Kendall's tau, taken as the pairwise Kendall's tau estimator. Note that for = = 0, 
the range of Kendall's tau as a function in 9 is (0, 1]; see Figure [6] (left). Furthermore, 
for 9 = 9i sufficiently small, the range of Kendall's tau as a function in v is (0, 1). 



1 



Algorithm 6.2 

(1) Choose h-,h^ > 0, and e > 0. 

(2) Let the smallest u be denoted by vi = 0. 

(3) Solve t{i'i,9u) = max{f:„ — h-,e} with respect to 9u- 

(4) Solve t{i^i,9i) = minjfn -|- 1 — e} with respect to 9i. 

(5) Solve t{uu, 9i) = max{fn — e} with respect to 

(6) Return the initial interval / = [{ui, 9i)^, {v^, 9y_)'^]- 



We generate = 200 times n = 100 realizations of i.i.d. random vectors in d = 10 
dimensions following GIG copulas with parameters 9 = {u, 9)^ = (0.1, 0.SSSS)""" resulting 
in a Kendall's tau of 0.25, = (0.05, 0.0968)"'" with corresponding Kendall's tau equal to 
0.5, and 6 = (0.01, 0.0012)"'" with Kendah's tau equal to 0.75 (the choice of N is made 
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1531 
1B32 



1535 
1536 



1539 



1550 
1561 



1501 solely to the larger run time for this family). For finding initial intervals, Algorithm |6.2| 

^g(j3 is applied with e = 0.005 and = h+ = 0.15. The results are summarized in Table [5} 

1504 Note that especially under weak concordance, the GIG family requires a larger sample 
size n in order for the bias to be small. 

1506 

i5or ~ I 

1508 i'n (^n 

1509 
1510 
1511 
1512 
1513 
1514 
1516 
1516 
1517 
1518 
1519 
1520 
1521 
1522 
1523 
1524 
1526 
1526 
1527 
1528 

1529 We presented explicit functional forms for the generator derivatives of well-known Ar- 

1530 chimedean copulas. These explicit formulas are of interest for several reasons. Apart 
from being able to express various important quantities such as conditional distributions 

1533 or the Kendall distribution function explicitly, the generator derivatives allow us to 

^^^^ apply maximum-likelihood estimation for estimating the parameter vectors of various 

Archimedean copulas, even in large dimensions such as d = 100. The excellent perfor- 
1537 mance in terms of both precision and run time of maximum likelihood estimation was 

1538 



n 


r 


V 


Bias 


RMSE 


e 


Bias 


RMSE 


# 


MUT 


100 


0.25 


0.1000 


0.1622 


0.3769 


0.8333 


-0.0178 


0.1395 


57 


12.3s 


100 


0.5 


0.0500 


0.0065 


0.0538 


0.0968 


0.0002 


0.0134 


55 


12.7s 


100 


0.75 


0.0100 


0.0198 


0.0541 


0.0012 


-0.0001 


0.0005 


93 


39.3 s 


500 


0.25 


0.1000 


0.0480 


0.1680 


0.8333 


-0.0003 


0.0562 


63 


67.0 s 


500 


0.5 


0.0500 


0.0034 


0.0295 


0.0968 


0.0003 


0.0063 


53 


60.4 s 


500 


0.75 


0.0100 


0.0057 


0.0286 


0.0012 


-0.0000 


0.0003 


102 


168.3 s 



Table 5 Summary statistics for estimating two-parameter GIG copulas. 
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shown in Hofert et al. (2011 1 in a large-scale comparison with various other estimators 

1540 up to dimension d = 100. In the present work, we presented the theoretical details 

1541 and showed that maximum-likelihood estimation is also feasible for multi-parameter 

1543 Archimedean families. Furthermore, we showed that the mean squared error MSE is 

1544 decreasing in the dimension d and that this decrease is of the same order as the decrease 
1546 ^Y^Q sample size n, that is, MSE oc l/{nd). We also constructed initial intervals for 
^g^^ the likelihood optimization. Moreover, we obtained likelihood-based confidence intervals 

1548 for the parameter vector and compared them to information-based confidence intervals 

1549 £qj, Clayton family where the Fisher information is comparably easy to compute. A 
transparent implementation of the presented results is given in the open source R package 

1552 nacopula, so that the interested reader can easily follow our calculations. 

1553 
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